Web Catchphrase Improve System Employing Onomatopoeia and Large-Scale N-gram Corpus
Abstract
In this paper, we propose a system that improves text catchphrases on the web using onomatopoeia and the Japanese Google N-grams. Onomatopoeia is regarded as a fundamental tool in people's daily communication. The proposed system inserts an onomatopoetic word into plain-text catchphrases. Based on a large catchphrase encyclopedia, the proposed system evaluates each catchphrase’s candid...
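The insertion step described in the abstract can be illustrated with a minimal Python sketch. Because the abstract is truncated before it explains how candidates are evaluated, the scoring rule here (a sum of bigram counts) is an assumption, not the authors' method. The count table `TOY_BIGRAM_COUNTS`, the romanized example words, and the function names are all hypothetical stand-ins; a real system would look counts up in a large N-gram corpus.

```python
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]

# Hypothetical toy counts standing in for a large N-gram corpus.
TOY_BIGRAM_COUNTS: Dict[Bigram, int] = {
    ("our", "fuwafuwa"): 30,
    ("fuwafuwa", "pancakes"): 120,
    ("our", "sakusaku"): 30,
    ("sakusaku", "pancakes"): 5,
}


def candidates(words: List[str], onomatopoeia: List[str]) -> List[List[str]]:
    """Insert each onomatopoetic word at every possible position."""
    result = []
    for ono in onomatopoeia:
        for i in range(len(words) + 1):
            result.append(words[:i] + [ono] + words[i:])
    return result


def score(words: List[str], counts: Dict[Bigram, int]) -> int:
    """Assumed scoring rule: sum the counts of adjacent word pairs (unseen pairs count 0)."""
    return sum(counts.get(pair, 0) for pair in zip(words, words[1:]))


def best_candidate(
    words: List[str], onomatopoeia: List[str], counts: Dict[Bigram, int]
) -> List[str]:
    """Return the candidate catchphrase with the highest assumed score."""
    return max(candidates(words, onomatopoeia), key=lambda c: score(c, counts))


if __name__ == "__main__":
    phrase = ["try", "our", "pancakes"]
    print(" ".join(best_candidate(phrase, ["fuwafuwa", "sakusaku"], TOY_BIGRAM_COUNTS)))
```

With the toy counts above, the sketch prefers the candidate whose inserted word co-occurs most often with its neighbours. Summing raw counts is only the simplest possible choice; it says nothing about how the paper weighs candidates against its catchphrase encyclopedia.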
Similar resources
An Overview of Microsoft Web N-gram Corpus and Applications
This document describes the properties and some applications of the Microsoft Web N-gram corpus. The corpus is designed to have the following characteristics. First, in contrast to the static data distribution of previous corpus releases, this N-gram corpus is made publicly available as an XML Web Service so that it can be updated as deemed necessary by the user community to include new words and ph...
Development of a Web-Scale Chinese Word N-gram Corpus with Parts of Speech Information
The web provides a large-scale corpus for researchers to study language usage in the real world. Developing a web-scale corpus requires not only substantial computational resources but also great effort to handle the large variations in web texts, such as character encodings when processing Chinese web texts. In this paper, we aim to develop a web-scale Chinese word N-gram corpus with parts of speech i...
Semantic Constraint and QoS-Aware Large-Scale Web Service Composition
Service-oriented architecture facilitates run-time interactions through business integration over networks. Currently, web services are considered the best option for providing Internet services. Because of the increasing number of Web users and the complexity of their queries, simple, atomic services cannot meet users' needs; and to provide complex services, it requ...
Syntactic N-gram Collection from a Large-Scale Corpus of Internet Finnish
In this paper, we report on the development of a large-scale Finnish Internet parsebank, currently consisting of 1.5 billion tokens in 116 million sentences. The data is fully morphologically and syntactically analyzed and it has been used to extract flat and syntactic n-gram collections, as well as verb-argument and noun-argument n-grams. Additionally, distributional vector space representation...
Journal
Journal title: International Journal of Fuzzy Logic and Intelligent Systems
Year: 2012
ISSN: 1598-2645
DOI: 10.5391/ijfis.2012.12.1.94